Comparison of Grapheme-to-Phoneme Conversion Methods on a Myanmar Pronunciation Dictionary
Abstract
Grapheme-to-phoneme (G2P) conversion is the task of predicting the pronunciation of a word from its written (graphemic) form. It is an important component of both automatic speech recognition (ASR) and text-to-speech (TTS) systems. In this paper, we evaluate seven G2P conversion approaches on a manually tagged Myanmar phoneme dictionary: Adaptive Regularization of Weight Vectors (AROW) based structured learning (S-AROW), Conditional Random Fields (CRF), joint-sequence models (JSM), phrase-based statistical machine translation (PBSMT), Recurrent Neural Networks (RNN), Support Vector Machine (SVM) based point-wise classification, and Weighted Finite-State Transducers (WFST). The G2P bootstrapping experiments were evaluated both automatically, using phoneme error rate (PER), and manually, by checking voiced/unvoiced, tone, consonant, and vowel errors. The results show that the CRF, PBSMT, and WFST approaches perform best for G2P conversion of the Myanmar language.
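The abstract does not spell out how PER was scored. As a rough illustration only, the sketch below uses the common definition: the Levenshtein (edit) distance between predicted and reference phoneme sequences, summed over all entries and divided by the total number of reference phonemes. The function names and the toy phoneme symbols are hypothetical placeholders, not taken from the paper or its dictionary.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the first i-1 reference
    # phonemes and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """PER in percent: total edit errors / total reference phonemes."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of entries")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_errors / total_ref


# Toy data with made-up symbols (assumption: one token per phoneme).
references = [["k", "a"], ["m", "i", "n"]]
predictions = [["k", "a"], ["m", "e", "n", "a"]]
per = phoneme_error_rate(references, predictions)
```

In this toy case the second prediction has one substitution and one insertion against five reference phonemes in total. Because insertions count as errors, PER defined this way can exceed 100%.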
Related Resources
Solving the Phoneme Conflict in Grapheme-to-Phoneme Conversion Using a Two-Stage Neural Network-Based Approach
To achieve high quality output speech synthesis systems, data-driven grapheme-to-phoneme (G2P) conversion is usually used to generate the phonetic transcription of out-of-vocabulary (OOV) words. To improve the performance of G2P conversion, this paper deals with the problem of conflicting phonemes, where an input grapheme can, in the same context, produce many possible output phonemes at the sa...
Grapheme to phoneme conversion and dictionary verification using graphonemes
We present a novel data-driven, language-independent approach for grapheme-to-phoneme conversion, which achieves a phoneme error rate of 3.68% and a pronunciation error rate of 17.13% for English. We apply our stochastic model to the task of dictionary verification and conclude that it is able to detect spurious entries, which can then be examined and corrected by a human expert.
Bi-directional conversion between graphemes and phonemes using a joint N-gram model
We present in this paper a statistical model for language-independent bi-directional conversion between spelling and pronunciation, based on joint grapheme/phoneme units extracted from automatically aligned data. The model is evaluated on spelling-to-pronunciation and pronunciation-to-spelling conversion on the NetTalk database and the CMU dictionary. We also study the effect of including lexical...
Further Improvements to Pronunciation by Analogy
The synthesis quality is influenced by many important factors, among which the correctness of the grapheme-to-phoneme conversion is one of the crucial ones. The globalization phenomenon makes it impossible to have a dictionary with all of the existing words for each language. Automatic letter-to-sound systems have been in the center of attention for the last decade. One of the most effective and...
Comparison of Grapheme-to-Phoneme Methods on Large Pronunciation Dictionaries and LVCSR Tasks
Grapheme-to-Phoneme conversion (G2P) is usually used within every state-of-the-art ASR system to generalize beyond a fixed set of words. Although the performance is typically already quite good (< 10% phoneme error rate) and pronunciations of important words are checked by a linguist, further improvements are still desirable, especially for end user customization. In this work, we present and c...